Text File | 1991-09-26 | 29KB | 730 lines

                        68020 CROSSASSEMBLER

                            Version 1.0

                            Autumn 1991

                         by Andrew E. Romer


The information in this document has been carefully checked and is
believed to be entirely reliable. No responsibility, however, is
assumed for inaccuracies.


Index

 1. Introduction
 2. Assembler command line
 3. Source file format
     3.1. Empty statement
     3.2. Label statement
     3.3. Comment statement
     3.4. Full statement
 4. Statement fields
     4.1. Label field
     4.2. Operation field
     4.3. Operand field
     4.4. Comment field
 5. Character set
 6. Microprocessor instructions
 7. Assembler directives
     7.1. END
     7.2. ORG
     7.3. EQU
     7.4. SET
     7.5. REG
     7.6. DC - define constant
     7.7. DCB - define constant block
     7.8. DS - define storage
 8. Expressions
     8.1. Numeric values
     8.2. Order of precedence
     8.3. Operators
 9. Assembler processing
10. Addressing modes
     10.1. Size
     10.2. QUICK instructions
11. Assembler output
12. Assembly listing
13. Error reporting
14. References


1. Introduction.

The 68020 Cross-Assembler is an IBM PC (or compatible) program that
processes source program statements written in the 68020 Assembly
Language and produces machine-readable binary code.

This Assembler is an upgrade of the 68000 Cross-Assembler, Ref. 2; it
has been designed to conform to the format defined by Motorola
(Ref. 1).


2. Assembler command line

The Assembler is invoked by the following command line, entered at
the DOS prompt:

    asm [-switches] filename

where 'filename' is the name of the source file. If 'filename' has no
extension then '.asm' extension is assumed by default. Otherwise the
specified extension is used.

The following optional switches are valid:

    c - Generates full hex code listing. Only one line of listing is
        generated by default. This switch has no effect if the -l
        switch is not invoked.

    h - Displays a brief help message on the screen. No other
        switches have any effect if this switch is invoked, and
        'filename' is not processed.

    l - Enables assembly listing file 'filename.lis' generation. No
        assembly listing file is generated by default.

    n - Disables the target code generation. The target code
        'filename.h68' is generated by default.

The switches can be entered, in arbitrary sequence, without
intervening spaces, after the '-' character, or each switch can be
entered following a separate '-', in which case separating spaces are
required. Therefore the following examples are valid:

    asm -lc filename
    asm -cl filename
    asm -l -c filename

The help message can also be displayed by entering:

    - asm (without arguments), or
    - asm ?

at the DOS prompt.


3. Source file format

An assembler source file is an ASCII file which contains a sequence
of source statements. The first source statement begins with the
first character of the source file and is terminated by the 'newline
character', NL; the statements following the first are delimited by
NL's. Under DOS NL is a sequence of two characters: Carriage Return
CR (hexadecimal 0D) followed by Line Feed LF (hexadecimal 0A). The
characters contained between, but excluding, the NL delimiters are
the statement context.

3.1. Empty statement

A source statement whose context consists exclusively of white space
(blanks and horizontal tabs) is called an empty statement. Empty
statements do not generate target code, but are included in the
assembly listing.

3.2. Label statement

A label statement consists of a 'valid first label-character',
optionally followed by a sequence of 'valid subsequent
label-characters', followed by a colon ':', and, optionally, by white
space. Valid first and subsequent label-characters are defined in
4.1. Label field. No white space may precede the label.

3.3. Comment statement

A statement beginning with the asterisk ('*') is a comment statement.
Comment statements do not generate target code, but their context is
included in the assembly listing. Comments can also be included in
the comment field of a full statement.

3.4. Full statement

A full statement consists of up to 4 fields: label field, operation
field, operand field, and comment field. The fields are separated by
white space.

The label field is optional as a rule. The exceptions to this rule
are listed in 7. Assembler directives.

The operation field is obligatory.

The operand field is not always required. If it is not required then
the characters entered in the operand field are regarded as belonging
to the comment field.

The comment field is optional.


4. Statement fields

4.1. Label field

If the first character of a statement line is a white-space character
then the label field is empty. Otherwise it must be a 'valid first
label-character', optionally followed by 'valid subsequent
label-characters'.

A valid first label-character may be any of the following:

    - letters of the alphabet A...Z, a...z
    - the underscore _
    - the full stop .

and valid subsequent label-characters may be:

    - letters of the alphabet A...Z, a...z
    - the numerals 0...9
    - the underscore _

Upper and lower case letters are not distinct, i.e. name, NAME and
nAMe are regarded as identical. Only the first eight characters are
significant, i.e. longname, longnameone and longname123 are regarded
as identical. Labels are nevertheless passed to the assembly listing
exactly as spelled in the source, including letter case. Using labels
that differ only beyond the eighth character, or only in letter case,
is not recommended as it makes the source code more difficult to
understand.

Certain operations require the label field to be present, and some
require it not to be present (see 7. Assembler directives). If the
source line contravenes either of these requirements then an error
message, or warning message, is generated by the assembler (see
13. Error reporting).

Labels associated with an opcode become equal to the value of the
assembler program counter at the time when the source line is read;
those associated with directives are defined by the directive itself.
A label can be used as a symbol in an expression or as an address
pointer.

4.2. Operation field

An operation field is always required in a full statement. An
operation, represented by the operation mnemonic, can either be a
microprocessor instruction or an assembler directive.

An instruction, together with its operand, if required, will cause
the assembler to generate a corresponding binary operation code
(opcode) that can be acted upon by the microprocessor. The opcode
generated is entered as a sequence of hexadecimal digits in the
target and listing files.

An assembler directive generates no opcode; instead it instructs the
assembler to follow a specified course of action.

4.3. Operand field

If the opcode or directive requires an operand then the field
immediately following the operation field is the operand field,
otherwise it is the comment field.

The operand field format depends on the operation it follows. For
microprocessor instructions (opcodes) the operand format will be
found in Ref. 1; for directives it is given together with the
directive definition in this manual, see 7. Assembler directives.

If an operand is required for an operation and is absent, then an
error message is issued.

4.4. Comment field

The comment field is optional. It generates no code, but is passed to
the assembly listing unchanged. It can therefore be used to explain
the meaning of the operation it is attached to. Skilful use of
comments, whether in comment statements or in comment fields, is an
important part of good programming practice.


5. Character set

Except in character strings delimited by single quotes, the assembler
does not distinguish between upper and lower cases of letters of the
alphabet.

All printable characters are recognized by the assembler in quoted
strings of characters.
In these the single quote character is represented by repetition, to
distinguish a single quote from the string terminator:

    'It''s a string'

is read by the assembler as:

    It's a string

Characters valid in the label field have been defined in the field's
description.

The following characters are valid in the operation field:

    - the letters of the alphabet,
    - the full stop '.',
    - the characters '[', ']', ':' (in bit field operations).

The operand field recognizes:

    - the letters of the alphabet,
    - the decimal digits,
    - the numeric base designator prefix characters: $, @, %,
    - the ASCII constant delimiter: ',
    - the arithmetic operators: +, -, *, /, \,
    - the Boolean operators: & (AND), ! (OR), ~ (NOT),
    - the shift operators: <<, >>,
    - the special characters: , (comma), : (colon), . (full stop),
      and the brackets (, ), [, ], {, }.

All printable characters are recognized in comments.


6. Microprocessor instructions.

All mnemonics defined by Motorola are recognized, as are the size
specifiers:

    .b - byte (8 bits),
    .s - short (8 bits),
    .w - word (16 bits),
    .l - long word (32 bits).

.b, .w, and .l are used with operations other than branch operations;
.s, .w, and .l are used with branch operations. It is beyond the
scope of this manual to describe the details of mnemonics; the full
description can be found in Ref. 1.

If no size specifier is present then WORD size '.w' is assumed by
default. In branch operations the size of the operation is calculated
by the assembler and the smallest size necessary is used, provided
that the destination of the branch operation is known when the
operation is processed. If the destination is not known then a 16-bit
branch is assumed by default. If the assembler finds later that an
8-bit branch would be sufficient then a warning is issued; if the
long branch is found necessary then the operation is flagged as an
error.


7. Assembler Directives

7.1. END

The END directive indicates the end of the source file; any source
lines following this directive will be ignored by the assembler. This
directive does not require an operand and must not have a label.

The use of the END directive is optional; if it is not present the
assembler will process source lines until the end of the source file.

7.2. ORG

Format: [<label>] ORG <expression>

The ORG (origin) directive resets the assembler's program counter to
the directive's operand. The operand can be any valid arithmetic
expression. The directive may have, but does not require, a label.
The label, if present, becomes equal to the directive's operand.

7.3. EQU

Format: <label> EQU <expression>

The EQU (equate) directive equates the value of the obligatory label
to the directive's operand. The operand can be any valid arithmetic
expression and must be defined before the point at which the EQU
directive appears. The label defined by the EQU directive must not be
redefined.

7.4. SET

Format: <label> SET <expression>

The SET directive equates the value of the obligatory label to the
directive's operand. The operand can be any valid arithmetic
expression and must be defined before the point at which the SET
directive appears. The label defined by the SET directive, unlike
one defined by the EQU directive, may be redefined.

7.5. REG

Format: <label> REG <register range>[/<register range>...]

The REG directive equates the obligatory label to the register list,
to be handled as a single operand by subsequent MOVEM instructions.
The 'register range' operand can either be a single register, <Dn> or
<An>, or a range of registers <Dn-Dm>, <An-Am>. The registers and
ranges may be specified in any order, thus all the following are
identical:

    D1/D2/D3/A0/A1/A2/A3
    D1-D3/A0-A3
    A3-A0/D3-D1

7.6. DC - define constant

Format: [<label>] DC[.<size>] <item>[,<item>...]

This directive will cause an appropriate number of memory locations
to be initialized to the values specified by the consecutive item
operands.
The optional label is equated to the address of the start of the
block, and the size parameter defines the number of bytes allocated
for each item. The item argument is an expression defining the value
to be placed in the corresponding memory locations.

7.7. DCB - define constant block

Format: [<label>] DCB[.<size>] <length>,<value>

The optional label is equated to the address of the start of the
block. The size code specifies a block of bytes (size code .B), words
(.W) or long words (.L). If the size code is omitted then word size
is assumed.

The length argument can be any non-negative expression; it defines
the number of elements in the block. It has to be defined before the
statement where the DCB directive appears. The value argument is an
expression defining the value to be placed in each element of the
block; it does not have to be defined beforehand.

Word and long word sized elements are placed starting at a word
boundary; the assembler will increment the location counter by one if
necessary. On the other hand, a block of byte sized elements will
start at any location, but the following instruction will be placed
at a word boundary, again by incrementing the location counter if
necessary.

7.8. DS - define storage

Format: [<label>] DS[.<size>] <length>

The optional label is equated to the address of the start of the
block. The size code specifies a block of bytes (size code .B), words
(.W) or long words (.L). If the size code is omitted then word size
is assumed.

The length argument can be any non-negative expression; it defines
the number of elements in the block. It has to be defined before the
statement where the DS directive appears.

Word and long word sized elements are placed starting at a word
boundary; the assembler will increment the location counter by one if
necessary.
On the other hand, a block of byte sized elements will start at any
location, but the following instruction will be placed at a word
boundary, again by incrementing the location counter if necessary.

The memory block reserved for storage by the DS directive is not
initialized.


8. Expressions

An expression is a sequence of numeric values and operators.

8.1. Numeric values

The numeric values can be entered as symbols (4.1.), or as explicit
values in a decimal, octal, binary, or hexadecimal base. A sequence
of decimal digits is assumed to be a decimal value. Explicit values
in bases other than decimal are denoted by preceding the sequence of
digits with the base designator:

    binary          %
    octal           @
    hexadecimal     $

The usual conventions apply as regards the digits used in non-decimal
bases:

    binary          0, 1
    octal           0, 1, 2, 3, 4, 5, 6, 7
    hexadecimal     0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f
               or   0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F

Examples: Valid numeric values are:

    decimal:        0, 5, 142857
    binary:         %0, %101, %100010111000001001
    octal:          @0, @5, @427011
    hexadecimal:    $0, $5, $3fe, $FF7B

All values in an expression, both intermediate and final, are limited
to 32 bits. Larger values, whether entered or generated, are reduced
modulo (2 to the power of 32) = $100000000.

8.2. Order of precedence

Expressions are evaluated in the usual order of precedence: the
parenthesized sub-expressions are evaluated first, then the
expression is evaluated in the order of precedence of the operators.
Precedence of operators is listed below.

8.3. Operators

The operators recognized by the assembler are listed below, in groups
of precedence. Highest precedence is listed first:

    1. Unary minus -,
       Bitwise negation ~ (one's complement),

    2. Left shift << (a<<b yields a shifted left by b bits,
                      zero filled),
       Right shift >> (a>>b yields a shifted right by b bits,
                       zero filled),

    3. Bitwise AND &,
       Bitwise OR !

    4. Multiplication *,
       Division / (truncated integer division, i.e. 5/3=1),
       Remainder \ (5\3=2),

    5. Addition +,
       Subtraction -.

Operators of the same precedence are evaluated left to right.


9. Assembler processing

The assembler processes the source file in two passes.

In the first pass:

    - each label string of characters is stored in the label table,
      together with its value, when the definition of the label is
      encountered,
    - the number of target code bytes is established for each source
      line.

When the number of target code bytes depends on the operand size, and
the operand size is unknown at the time it is encountered (e.g. the
operand uses a label which has not been defined yet), then the most
pessimistic assumption (i.e. 32-bit size) is made, except for the
branch operations. Branch operations of unknown size are assumed to
be 16-bit branches.

In the second pass the required output (target code, assembly
listing) is generated.

The assembler will make assumptions in some other cases. For instance
the MOVE mnemonic will be processed as MOVEA if the destination
operand is an address register, or as MOVEQ if the source operand is
an immediate value not exceeding 8 bits and the target operand is a
data register.


10. Addressing modes

All addressing modes of the effective address as specified in Ref. 1
are supported.
The following symbols are used to describe the operand formats:

    Dn    = Data Register
    An    = Address Register
    SP    = A7
    Xn    = Data or Address register used as Index register
    size  = size code (B, W or L)
    scale = scale in indexed addressing (1, 2, 4 or 8)
    d8    = 8-bit displacement
    d16   = 16-bit displacement
    bd    = base displacement (16- or 32-bit)
    od    = outer displacement (16- or 32-bit)
    ex16  = Expression that evaluates to a 16-bit value
    ex    = Any expression
    PC    = Program Counter

The following register names may be also used as operands in certain
instructions (e.g.: MOVEC to CCR):

    SR   = Status Register
    CCR  = Condition Code Register
    USP  = User Stack Pointer
    SFC  = Source Function Code Register
    DFC  = Destination Function Code Register
    VBR  = Vector Base Register
    CACR = Cache Control Register
    CAAR = Cache Address Register
    MSP  = Master Stack Pointer
    ISP  = Interrupt Stack Pointer

                      Effective Address Modes

Mode                                          Assembler Format
--------------------------------------------- --------------------------
Data Register Direct                          Dn
Address Register Direct                       An
Address Register Indirect                     (An)
Address Register Indirect with Postincrement  (An)+
Address Register Indirect with Predecrement   -(An)
Address Register Indirect with Displacement   (d16,An)
Address Register Indirect with Index
  (8-bit displacement)                        (d8,An,Xn.size*scale)
Address Register Indirect with Index
  (base displacement)                         (bd,An,Xn.size*scale)
Memory Indirect Post-Indexed                  ([bd,An],Xn.size*scale,od)
Memory Indirect Pre-Indexed                   ([bd,An,Xn.size*scale],od)
Program Counter Indirect with Displacement    (d16,PC)
Program Counter Indirect with Index
  (8-bit displacement)                        (d8,PC,Xn.size*scale)
Program Counter Indirect with Index
  (base displacement)                         (bd,PC,Xn.size*scale)
Program Counter Memory Indirect Post-Indexed  ([bd,PC],Xn.size*scale,od)
Program Counter Memory Indirect Pre-Indexed   ([bd,PC,Xn.size*scale],od)
Absolute Long Address                         (ex).L
Absolute Short Address                        (ex16).W
Immediate Data                                #ex

In all non-indexed modes SP can be used in place of A7.
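
The indexed and memory indirect formats in the table are easier to
read with their address arithmetic written out. The Python sketch
below is illustrative only and is not part of the assembler: the
function names and the read_long callback are hypothetical, and the
arithmetic follows the general 68020 definition of these modes
(Ref. 1) rather than anything specific to this program.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def index_value(xn: int, size: str, scale: int) -> int:
    """Xn.size*scale: a word-sized index is sign-extended before scaling."""
    bits = 16 if size.upper() == "W" else 32
    return sign_extend(xn, bits) * scale


def ea_indexed(bd, an, xn, size="L", scale=1):
    """(bd,An,Xn.size*scale): base displacement + base register + index."""
    return (bd + an + index_value(xn, size, scale)) & 0xFFFFFFFF


def ea_post_indexed(read_long, bd, an, xn, size, scale, od):
    """([bd,An],Xn.size*scale,od): fetch a pointer first, then add the index."""
    pointer = read_long((bd + an) & 0xFFFFFFFF)
    return (pointer + index_value(xn, size, scale) + od) & 0xFFFFFFFF


def ea_pre_indexed(read_long, bd, an, xn, size, scale, od):
    """([bd,An,Xn.size*scale],od): add the index first, then fetch a pointer."""
    pointer = read_long(ea_indexed(bd, an, xn, size, scale))
    return (pointer + od) & 0xFFFFFFFF


# Hypothetical memory image: two long-word pointers stored at fixed addresses.
memory = {0x1010: 0x2000, 0x1018: 0x3000}

print(hex(ea_indexed(0x10, 0x1000, 4, "L", 2)))                        # 0x1018
print(hex(ea_post_indexed(memory.__getitem__, 0x10, 0x1000, 4, "L", 2, 8)))  # 0x2010
print(hex(ea_pre_indexed(memory.__getitem__, 0x10, 0x1000, 4, "L", 2, 8)))   # 0x3008
```

The Program Counter variants follow the same pattern with PC in place
of An.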
The following special cases of the above addressing modes, while
legal, are not supported:

    - Memory Indirect with all elements suppressed
        ([],,) and ([,],)
    - Memory Indirect without index
        ([An],,) and ([An,],)
    - Memory Indirect without base register
        ([],Xn,) and ([,Xn],)
    - PC Memory Indirect with all elements suppressed
        ([ZPC],,) and ([ZPC,],)
    - PC Memory Indirect without index
        ([PC],,) and ([PC,],)
    - PC Memory Indirect without base register
        ([ZPC],Xn,) and ([ZPC,Xn],)

10.1. Size

Many instructions, and directives, can have their size specified. The
size specification is coded by appending .B for 8-bit, .W for 16-bit,
and .L for 32-bit size. The only exception is the set of the branch
instructions where the three sizes are coded as .S, .W, and .L,
respectively.

When the size specification is omitted, the assembler tries to make a
guess: if the value of the expression is known at the first pass the
assembler will allocate size as appropriate, otherwise it will
allocate the 32-bit size as the most pessimistic assumption. Again
the branch instructions are handled differently: 16-bit branches are
assumed by default. At the second pass all expression values are
known and the assembler will flag excessive branch sizes as warnings,
and 32-bit branches with 16-bit sizes as errors.

Not all microprocessor instructions accept all three sizes. Refer to
Ref. 1 for details.

10.2. QUICK instructions

The three "quick" instructions ADDQ, MOVEQ, and SUBQ are
automatically selected by the assembler when the addressing mode and
operand value conform to the "quick" version requirements:

    move.l      #<data>,Dn

will be coded as

    moveq       #<data>,Dn

if <data> is known at the first pass and does not exceed 8 bits, and
similarly

    add[.size]  #<data>,<ea>
    sub[.size]  #<data>,<ea>

will be coded as

    addq        #<data>,<ea>
    subq        #<data>,<ea>

if <data> is known at the first pass and falls within the range of 1
to 8.

The MOVEQ instruction will accept both signed and unsigned operands,
i.e. <data> within -128 and 255. Naturally, negative operands are
coded as corresponding positive operands above 127, according to the
rules of "two's complement", e.g. decimal values -55 and +201 will
both be coded as hexadecimal C9.


11. Assembler output

The target code is stored in the file bearing the same file name as
the source file, but with the extension changed to .H68. The code is
stored using the Motorola S-record format.


12. Assembly listing

The assembly listing is stored in a file bearing the same name as the
source file, but with the extension changed to .LIS.

The listing format is as follows: each line starts with a 5-digit
decimal line number, generated automatically by the assembler,
followed by the 8-digit hexadecimal program counter value. This in
turn is followed by the hexadecimal opcode and its argument, after
which the source line is appended.

The listing is formatted. Whatever the size of the source file
fields, the listing allocates the following field sizes, listed in
the order of their appearance in the listing line:

    line number              -  6 characters
    program counter          -  9 characters
    hex opcode and argument  - 35 characters
    label field              - 10 characters
    opcode mnemonic          -  8 characters
    operand                  - 16 characters
    comment                  - remainder of line.

Except for the line number and the program counter fields, each
field's content can exceed its allocated size. If the hexadecimal
opcode and argument exceed their allocation, then only part of the
code is shown, with an ellipsis ("...") appended to indicate that the
listing is not complete. To show all code it is necessary to invoke
the assembler with the -c option. In this case the remainder of the
code is printed in the following line.

The fields following the line number and the program counter use up
as much space as required, with just one blank separating each field
from the next. Obviously it is not convenient to have lines
overflowing the printer's line length.
In most cases if the source line does not exceed 80 characters then
the resulting listing line does not exceed 132 characters. Most
printers can accommodate this line length, at least in condensed
mode.


13. Error reporting

As the assembler processes each source statement it records any
errors found. An error is classed according to its severity and
recorded in an error buffer. If more than one error is found then the
error in the buffer is replaced by the newly found one, provided that
the new error's severity is higher; otherwise the new error is
ignored. When the source statement processing is complete, the error
in the buffer is reported, both in the assembly listing file, in the
line following the statement it refers to, and on the screen.

There are four severity classes:

    - severe errors,
    - errors,
    - minor errors,
    - warnings.

The format of an assembly listing error message is:

    ERROR: <error message>

for severe errors, errors, and minor errors, or

    WARNING: <error message>

for warnings.
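
The buffer rule described above (only the most severe error found in
a statement survives to be reported) can be sketched in a few lines
of Python. This is a minimal illustration under stated assumptions,
not the assembler's own code: the Severity and ErrorBuffer names are
hypothetical, and the numeric ranking is merely one way to order the
four classes.

```python
from enum import IntEnum
from typing import Optional, Tuple


class Severity(IntEnum):
    # Hypothetical ranking: a larger value means a more severe class.
    WARNING = 1
    MINOR = 2
    ERROR = 3
    SEVERE = 4


class ErrorBuffer:
    """Holds at most one error for the statement being processed."""

    def __init__(self) -> None:
        self.current: Optional[Tuple[Severity, str]] = None

    def record(self, severity: Severity, message: str) -> None:
        # A new error replaces the buffered one only if it is more severe;
        # an error of equal or lower severity is ignored.
        if self.current is None or severity > self.current[0]:
            self.current = (severity, message)

    def report(self) -> Optional[str]:
        # Build the listing-file message; only warnings use the WARNING prefix.
        if self.current is None:
            return None
        severity, message = self.current
        prefix = "WARNING" if severity == Severity.WARNING else "ERROR"
        return f"{prefix}: {message}"


buf = ErrorBuffer()
buf.record(Severity.WARNING, "Excessive size")
buf.record(Severity.ERROR, "Undefined symbol")
buf.record(Severity.MINOR, "Invalid size code")
print(buf.report())  # ERROR: Undefined symbol
```

A fresh buffer would be used for each source statement, so that one
statement's error does not mask a less severe error in the next.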
The error message appearing on the screen is preceded by:

    in line <line number>:

The error messages are listed below according to severity:

Severe errors:

    - Invalid syntax
    - Invalid opcode
    - Invalid addressing mode
    - Label required with this directive
    - Symbol value differs between first and second pass
    - This code is not implemented
    - Short branch to the immediately following instruction is not
      allowed

Errors:

    - Undefined symbol
    - Division by zero attempted
    - Symbol multiply defined
    - Register list multiply defined
    - Register list symbol not previously defined
    - Forward references not allowed with this directive
    - Block length is less that zero

Minor errors:

    - Invalid size code
    - Invalid vector number
    - Branch instruction displacement is out of range or invalid
    - Displacement out of range
    - Absolute address exceeds 16 bits
    - Immediate data exceeds 3 bits
    - Immediate data exceeds 8 bits
    - Immediate data exceeds 16 bits
    - Origin value is odd, location counter set to next higher
      address
    - The symbol specified is not a register list symbol
    - Register list symbol used in an expression
    - Invalid constant shift count
    - Invalid label character

Warnings:

    - ASCII constant exceeds 4 characters
    - Numeric constant exceeds 32 bits
    - Evaluation of expression could not be completed
    - Excessive size
    - Unsized instruction, size ignored
    - Invalid or illegal size ignored
    - Invalid size, corrected
    - MOVEQ instruction constant exceeds 8 bits, Least significant
      8 bits used
    - No message defined

The last message is included as a safety measure. It intercepts an
illegal internal error code, should one be generated by a malfunction
of the assembler.

There are also a number of error messages potentially generated by
various functions of the assembler. These are included to aid
assembler modifications and should only be visible if changes are
made to the assembler.


14. References

1. Motorola "MC68020 User's Manual", MC68020UM/AD REV 1, Second
   Edition.

2. 68000 Assembler, Version 1.0, written by P. McKee at North
   Carolina State University, Electrical and Computer Engineering
   department, released to Public Domain by M. Shaban, 8/4/89.